RAW SEQUENCE LISTING DATE: 12/04/2001 

PATENT APPLICATION: US/09/973,025 TIME: 11:35:30 



Input Set : N:\Crf3\RULE60\09973025.txt 
Output Set: N:\CRF3\12042001\I973025.raw 



SEQUENCE LISTING 
3 (1) GENERAL INFORMATION: 

5 (i) APPLICANT: MAERTENS, GEERT 

6 BOSMAN, FONS 

7 DE MARTYNOFF, GUY 

8 BUYSE, MARIE-ANGE 

10 (ii) TITLE OF INVENTION: PURIFIED HEPATITIS C VIRUS ENVELOPE 

11 PROTEINS FOR DIAGNOSTIC AND THERAPEUTIC U 
13 (iii) NUMBER OF SEQUENCES: 111 

15 (iv) CORRESPONDENCE ADDRESS: 

16 (A) ADDRESSEE: NIXON & VANDERHYE P.C. 

17 (B) STREET: 1100 NORTH GLEBE ROAD 

18 (C) CITY: ARLINGTON 

19 (D) STATE: VIRGINIA 

20 (E) COUNTRY: U.S.A. 

21 (F) ZIP: 22201-4714 

23 (v) COMPUTER READABLE FORM: 

24 (A) MEDIUM TYPE: Floppy disk 

25 (B) COMPUTER: IBM PC compatible 

26 (C) OPERATING SYSTEM: PC-DOS/MS-DOS 

27 (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO) 
29 (vi) CURRENT APPLICATION DATA: 

C--> 30 (A) APPLICATION NUMBER: US/09/973,025 

C--> 31 (B) FILING DATE: 10-Oct-2001 

32 (C) CLASSIFICATION: 435 

34 (vii) PRIOR APPLICATION DATA: 

35 (A) APPLICATION NUMBER: 08/612,973 

36 (B) FILING DATE: 1996-03-11 

37 (viii) ATTORNEY/AGENT INFORMATION: 

38 (A) NAME: BYRNE, THOMAS E. 

39 (B) REGISTRATION NUMBER: 32,205 

40 (C) REFERENCE/DOCKET NUMBER: 1487-10 

42 (ix) TELECOMMUNICATION INFORMATION: 

43 (A) TELEPHONE: (703) 816-4000 

44 (B) TELEFAX: (703) 816-4100 
48 (2) INFORMATION FOR SEQ ID NO: 1: 

50 (i) SEQUENCE CHARACTERISTICS: 

51 (A) LENGTH: 21 base pairs 

52 (B) TYPE: nucleic acid 

53 (C) STRANDEDNESS: single 

54 (D) TOPOLOGY: linear 
56 (ii) MOLECULE TYPE: cDNA 
58 (iii) HYPOTHETICAL: NO 

C--> 60 (iv) ANTI-SENSE: NO 

65 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
67 GGCATGCAAG CTTAATTAAT T 
69 (2) INFORMATION FOR SEQ ID NO: 2: 

71 (i) SEQUENCE CHARACTERISTICS: 

72 (A) LENGTH: 68 base pairs 

73 (B) TYPE: nucleic acid 

74 (C) STRANDEDNESS: single 

75 (D) TOPOLOGY: linear 
77 (ii) MOLECULE TYPE: cDNA 
79 (iii) HYPOTHETICAL: NO 

C--> 81 (iv) ANTI-SENSE: NO 

85 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

87 CCGGGGAGGC CTGCACGTGA TCGAGGGCAG ACACCATCAC CACCATCACT AATAGTTAAT 60 
89 TAACTGCA 68 
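
Each record declares a LENGTH and then prints the residues in blocks of ten. A minimal sketch of a consistency check between the two, using the sequences shown for SEQ ID NO: 1 and SEQ ID NO: 2. The helper name `residue_count` is hypothetical and is not part of PatentIn or any USPTO tool.

```python
def residue_count(sequence_text: str) -> int:
    """Count residues, ignoring spaces and any trailing position digits."""
    return sum(1 for ch in sequence_text if ch.isalpha())


# Sequence rows copied from the listing (SEQ ID NO: 1 and SEQ ID NO: 2).
SEQ_ID_1 = "GGCATGCAAG CTTAATTAAT T"
SEQ_ID_2 = (
    "CCGGGGAGGC CTGCACGTGA TCGAGGGCAG ACACCATCAC CACCATCACT AATAGTTAAT 60 "
    "TAACTGCA 68"
)

# Declared lengths taken from the (A) LENGTH fields of each record.
for name, text, declared in (("SEQ ID NO: 1", SEQ_ID_1, 21),
                             ("SEQ ID NO: 2", SEQ_ID_2, 68)):
    status = "ok" if residue_count(text) == declared else "MISMATCH"
    print(f"{name}: counted {residue_count(text)}, declared {declared}: {status}")
```

Both records count out to their declared lengths (21 and 68 residues).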

91 (2) INFORMATION FOR SEQ ID NO: 3: 

93 (i) SEQUENCE CHARACTERISTICS: 

94 (A) LENGTH: 642 base pairs 

95 (B) TYPE: nucleic acid 

96 (C) STRANDEDNESS: single 

97 (D) TOPOLOGY: linear 
99 (ii) MOLECULE TYPE: cDNA 
101 (iii) HYPOTHETICAL: NO 

C--> 103 (iv) ANTI-SENSE: NO 

106 (ix) FEATURE: 

107 (A) NAME/KEY: CDS 

108 (B) LOCATION: 1..639 

110 (ix) FEATURE: 

111 (A) NAME/KEY: mat_peptide 

112 (B) LOCATION: 1..636 

115 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

117 ATG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA CTG TCC TGT 48 

118 Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys 

119 1 5 10 15 

121 CTG ACC ATT CCA GCT TCC GCT TAT GAG GTG CGC AAC GTG TCC GGG ATG 96 

122 Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Met 

123 20 25 30 

125 TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA 144 

126 Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala 

127 35 40 45 

129 GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT CGG GAG 192 

130 Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu 

131 50 55 60 

133 AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG CTC GCA GCT 240 

134 Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala 

135 65 70 75 80 

137 AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC GTC GAT TTG 288 

138 Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu 

139 85 90 95 

141 CTC GTT GGG GCG GCT GCT CTC TGT TCC GCT ATG TAC GTG GGG GAT CTC 336 

142 Leu Val Gly Ala Ala Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Leu 

143 100 105 110 

145 TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC TCG CCT CGC 384 
INFORMATION FOR SEQ ID NO: 4: 

176 (i) SEQUENCE CHARACTERISTICS: 

177 (A) LENGTH: 212 amino acids 

178 (B) TYPE: amino acid 

179 (D) TOPOLOGY: linear 
181 (ii) MOLECULE TYPE: protein 

183 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
INFORMATION FOR SEQ ID NO: 5: 

230 (i) SEQUENCE CHARACTERISTICS: 

231 (A) LENGTH: 795 base pairs 

232 (B) TYPE: nucleic acid 

233 (C) STRANDEDNESS: single 

234 (D) TOPOLOGY: linear 
236 (ii) MOLECULE TYPE: cDNA 
238 (iii) HYPOTHETICAL: NO 

240 (iv) ANTI-SENSE: NO 

243 (ix) FEATURE: 

244 (A) NAME/KEY: CDS 

245 (B) LOCATION: 1..792 

247 (ix) FEATURE: 

248 (A) NAME/KEY: mat_peptide 

249 (B) LOCATION: 1..789 

252 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
95 

278 TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336 

279 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 

280 100 105 110 

282 GTT CGG GAG AAC AAC TCT TCC 
290 GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT TCC GCT ATG TAC GTG 480 

291 Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 

292 145 150 155 160 

294 GGG GAC CTC TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC 528 

295 Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile 

296 165 170 175 

298 TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT 576 

299 Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr 

300 180 185 190 

302 CCC GGC CAC ATA ACG GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC 624 

303 Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 

304 195 200 205 

306 TGG TCG CCT ACA ACG GCC CTG GTG GTA TCG CAG CTG CTC CGG ATC CCA 672 

307 Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro 

308 210 215 220 

310 CAA GCT GTC GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA GTC CTG GCG 720 

311 Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala 

312 225 230 235 240 

314 GGT CTC GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 768 

315 Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile 

316 245 250 255 

318 GTG ATG CTA CTC TTT GCT CCC TAATAG 795 

319 Val Met Leu Leu Phe Ala Pro 

320 260 
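
In the coding records, each codon row is followed by a row of three-letter amino acid codes. A minimal sketch of how a codon row could be translated and compared against the row beneath it, assuming the standard genetic code. The names `translate`, `CODON_TO_ONE` and `ONE_TO_THREE` are hypothetical, and "Ter" for stop codons is an illustrative choice, not something this listing prints.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ONE_LETTER = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TO_ONE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ONE_LETTER)
}
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # stop codon; label chosen for this sketch only
}


def translate(codon_row: str) -> str:
    """Translate space-separated codons into three-letter residue codes."""
    return " ".join(ONE_TO_THREE[CODON_TO_ONE[c]] for c in codon_row.split())


# Codon rows copied from listing lines 290 and 318 of SEQ ID NO: 5.
print(translate("GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT TCC GCT ATG TAC GTG"))
print(translate("GTG ATG CTA CTC TTT GCT CCC"))
```

The two printed rows agree with listing lines 291 and 319. The same comparison would flag rows where a residue code was misread, such as Ile or Gln.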

323 (2) INFORMATION FOR SEQ ID NO: 6: 

325 (i) SEQUENCE CHARACTERISTICS: 

326 (A) LENGTH: 263 amino acids 

327 (B) TYPE: amino acid 

328 (D) TOPOLOGY: linear 
330 (ii) MOLECULE TYPE: protein 

332 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

334 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 

335 1 5 10 15 

337 Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 

338 20 25 30 

340 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 

341 35 40 45 

343 Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 

344 50 55 60 

346 Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 

347 65 70 75 80 

349 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 

350 85 90 95 

352 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 

353 100 105 110 

355 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 

356 115 120 125 

358 Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His 

359 130 135 140 
L:30 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER:] 

L:31 M:220 C: Keyword misspelled or invalid format, [(B) FILING DATE:] 

L:60 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:81 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:103 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:240 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:398 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:466 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:7 

L:535 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:586 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:9 

L:643 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:752 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:890 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:910 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:930 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:950 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:970 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:990 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:1010 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:1082 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:21 

L:1154 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:1277 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:] 

L:1407 M:220 C: Keyword misspelled or invalid format, [(iv) 
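
The report entries follow a fixed pattern: a listing line number after `L:`, a message code after `M:`, a one-letter flag (`C` or `W` in this report), and the message text. A minimal sketch of a parser for one-line entries of that shape. `ReportEntry` and `parse_entry` are hypothetical names, and the meaning of the flag letters is not stated in this report, so the sketch only records them.

```python
import re
from typing import NamedTuple, Optional


class ReportEntry(NamedTuple):
    line: int          # line number in the sequence listing (the L: field)
    message_code: int  # numeric message code (the M: field)
    flag: str          # one-letter flag as printed, upper-cased
    text: str          # remainder of the message


ENTRY_RE = re.compile(r"^L:(\d+)\s+M:(\d+)\s+([A-Za-z]):\s*(.*)$")


def parse_entry(raw: str) -> Optional[ReportEntry]:
    """Parse one report line; return None if it does not match the pattern."""
    match = ENTRY_RE.match(raw.strip())
    if match is None:
        return None
    line, code, flag, text = match.groups()
    return ReportEntry(int(line), int(code), flag.upper(), text.strip())


sample = [
    "L:60 M:220 C: Keyword misspelled or invalid format, [(iv) ANTI-SENSE:]",
    "L:466 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:7",
]
for raw in sample:
    print(parse_entry(raw))
```

Grouping the parsed entries by `message_code` would separate the keyword-format messages (220) from the amino acid numbering messages (336).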